SDM with R
SDM with R
1st QCBS R Symposium
Outline
This tutorial illustrates how to build, evaluate and project species distribution models within R. The main steps we will go through, described bellow, are the following:
- R environment preparation
- Data preparation
- Study area and manipulating spatial information
- Environmental data
- Species occurrence data
- Presence-and-absence data
- Presence-only and pseudo-absences data
- Model fitting, prediction, and evaluation
- Projecting models
- Multi-species modelling
- DIY: Create a SDM yourself!
- Bonus
R environment preparation
Download the .zip file from here and extract files to your favorite QCBS R Symposium workshop folder.
To run this workshop, please use the latest version of R. A certain number of libraries are also required and will be downloaded. Install and load all required R libraries:
install.packages("rgdal")
install.packages("raster")
install.packages("rgeos")
install.packages("dismo")
install.packages("letsR")
install.packages("biomod2")
install.packages("biogeo")
install.packages("maptools")
install.packages("stringr")
install.packages("ggplot2")
install.packages("plotly")
install.packages("xtable")
install.packages("tidyr")library(rgdal)
library(raster)
library(rgeos)
library(dismo)
library(letsR)
library(biomod2)
library(biogeo)
library(maptools)
library(stringr)
library(ggplot2)
library(plotly)
library(xtable)
library(tidyr)Set your working directory to the same directory of the folder workshopSDM using the function setwd() or your prefered method.
Finally, we recommend to save and create a new working environment for this tutorial.
# save all the working space
save.image("QCBS_SDMTutorial.RData")
# free the working space
rm(list = ls())
# and get it back
load("QCBS_SDMTutorial.RData")Data preparation
Selecting your study area and manipulating spatial files
Most species distribution models are done using spatial grid information. In the next step, we will learn how to create a polygon grid of a given region, which will be used within our species distribution model framework.
For this tutorial, we have chosen the Neotropical zoogeographical region as our study area. Load the polygon shapefile of it using the readOGR function:
neotropical_shape <- rgdal::readOGR("data/shapefiles/neotropical.shp")OGR data source with driver: ESRI Shapefile
Source: "data/shapefiles/neotropical.shp", layer: "neotropical"
with 1 features
It has 3 fields
plot(neotropical_shape)Now that we have our study region, we need to create a grid, intersect our shapefile and create a new grid of that region. For this, we will use the modified GridFilter function (originally from Hidasi-Neto, J.). It allows us to input a polygon shapefile, specify a resolution and the proportion of overlap for each cell to be considered.
We will apply the function to the polygon neotropical_shape you have loaded.
GridFilter <- function(shape, resol = 1, prop = 0) {
grid <- raster(extent(shape))
res(grid) <- resol
proj4string(grid) <- proj4string(shape)
gridpolygon <- rasterToPolygons(grid)
drylandproj <- spTransform(shape, CRS("+proj=laea"))
gridpolproj <- spTransform(gridpolygon, CRS("+proj=laea"))
gridpolproj$layer <- c(1:length(gridpolproj$layer))
areagrid <- gArea(gridpolproj, byid = T)
gridpolproj <- gBuffer(gridpolproj, byid = TRUE, width = 0)
dry.grid <- intersect(drylandproj, gridpolproj)
areadrygrid <- gArea(dry.grid, byid = T)
info <- cbind(dry.grid$layer, areagrid[dry.grid$layer], areadrygrid)
dry.grid$layer <- info[, 3]/info[, 2]
dry.grid <- spTransform(dry.grid, CRS(proj4string(shape)))
dry.grid.filtered <- dry.grid[dry.grid$layer >= prop, ]
}
# Create a spatial polygon grid for the Neotropical region, with 5 degrees x
# 5 degrees
neotropical_grid <- GridFilter(neotropical_shape, resol = 5, prop = 0.5)
## Create an ID field for it, which will be useful in the future
neotropical_grid$ID <- 1:(length(neotropical_grid))# Export your resulting polygon grid
writeOGR(neotropical_grid, dsn = paste(getwd(), "/data/shapefiles", sep = ""),
layer = "neotropical_grid_5", driver = "ESRI Shapefile", overwrite_layer = T)This is our resulting polygon grid, where each cell refers to a site (and a row) in our data.
neotropical_grid <- readOGR("data/shapefiles/neotropical_grid_5.shp")OGR data source with driver: ESRI Shapefile
Source: "data/shapefiles/neotropical_grid_5.shp", layer: "neotropical_grid_5"
with 63 features
It has 5 fields
plot(neotropical_grid)Importing environmental variables
In species distribution modeling, predictor variables are typically extracted from raster (grid) type files. Each ‘raster’ should represent a given variable of interest, which can be climatic, soil, terrain, vegetation, land use, and other types variables.
The getData function from the dismo package is able to retrieve useful climate, elevation and administrative data.
bioClimVars <- getData("worldclim", var = "bio", res = 10)
bioClimVarsYou can also load your predictor variables from any directory within your computer using the raster (single file) and stack (loads multiple files) functions:
rasterFiles <- list.files("data/rasters/bio_10m_bil", pattern = "bil$", full.names = TRUE)
rasterFiles [1] "data/rasters/bio_10m_bil/bio1.bil"
[2] "data/rasters/bio_10m_bil/bio10.bil"
[3] "data/rasters/bio_10m_bil/bio11.bil"
[4] "data/rasters/bio_10m_bil/bio12.bil"
[5] "data/rasters/bio_10m_bil/bio13.bil"
[6] "data/rasters/bio_10m_bil/bio14.bil"
[7] "data/rasters/bio_10m_bil/bio15.bil"
[8] "data/rasters/bio_10m_bil/bio16.bil"
[9] "data/rasters/bio_10m_bil/bio17.bil"
[10] "data/rasters/bio_10m_bil/bio18.bil"
[11] "data/rasters/bio_10m_bil/bio19.bil"
[12] "data/rasters/bio_10m_bil/bio2.bil"
[13] "data/rasters/bio_10m_bil/bio3.bil"
[14] "data/rasters/bio_10m_bil/bio4.bil"
[15] "data/rasters/bio_10m_bil/bio5.bil"
[16] "data/rasters/bio_10m_bil/bio6.bil"
[17] "data/rasters/bio_10m_bil/bio7.bil"
[18] "data/rasters/bio_10m_bil/bio8.bil"
[19] "data/rasters/bio_10m_bil/bio9.bil"
# Stack the bioclimatic variables
bioClimVars <- stack(rasterFiles)
bioClimVarsclass : RasterStack
dimensions : 900, 2160, 1944000, 19 (nrow, ncol, ncell, nlayers)
resolution : 0.1666667, 0.1666667 (x, y)
extent : -180, 180, -60, 90 (xmin, xmax, ymin, ymax)
coord. ref. : +proj=longlat +ellps=WGS84 +towgs84=0,0,0,0,0,0,0 +no_defs
names : bio1, bio10, bio11, bio12, bio13, bio14, bio15, bio16, bio17, bio18, bio19, bio2, bio3, bio4, bio5, ...
min values : -269, -97, -488, 0, 0, 0, 0, 0, 0, 0, 0, 9, 8, 72, -59, ...
max values : 314, 380, 289, 9916, 2088, 652, 261, 5043, 2159, 4001, 3985, 211, 95, 22673, 489, ...
You can also plot your variables to take a look at them:
plot(bioClimVars/10) # WorldClim data is multiplied by 10. Let's take a look at the real information.Variable extraction
Before extracting variables, it is really important that your rasters have the same projection (i.e. type of coordinate reference system) as your other spatial files. See ?CRS for help in finding CRS definitions.
crs.geo <- CRS("+proj=longlat +ellps=WGS84 + datum=WGS84") # geographical, datum WGS84
neotropical_grid <- spTransform(neotropical_grid, crs.geo) # define projection system of grid data
proj4string(bioClimVars) <- crs.geo # and of our bioclimatic rastersNow, we can extract the predictor values for each cell of our grid!
myExpl <- data.frame(bio_1 = numeric(length(neotropical_grid)), bio_2 = numeric(length(neotropical_grid)),
bio_3 = numeric(length(neotropical_grid)))
# bio_1_ext is a list of cells that contains all values for each cell, for
# each predictor
bio_1_ext <- raster::extract(bioClimVars$bio1, neotropical_grid)
bio_2_ext <- raster::extract(bioClimVars$bio2, neotropical_grid)
bio_3_ext <- raster::extract(bioClimVars$bio3, neotropical_grid)
myExpl$bio_1 <- unlist(lapply(bio_1_ext, function(x) if (!is.null(x)) mean(x,
na.rm = TRUE) else NA))
myExpl$bio_2 <- unlist(lapply(bio_2_ext, function(x) if (!is.null(x)) mean(x,
na.rm = TRUE) else NA))
myExpl$bio_3 <- unlist(lapply(bio_1_ext, function(x) if (!is.null(x)) mean(x,
na.rm = TRUE) else NA))
write.table(myExpl, "data/matrices/NT_EnvVar_5.txt")
# Let's take a look at our table:
head(myExpl) bio_1 bio_2 bio_3
1 215.9267 138.49084 215.9267
2 236.4301 114.60676 236.4301
3 241.8685 99.48434 241.8685
4 239.1822 91.49031 239.1822
5 235.9922 99.52013 235.9922
6 269.0044 101.91778 269.0044
myExpl <- data.frame(bio_1 = numeric(length(neotropical_grid)), bio_4 = numeric(length(neotropical_grid)),
bio_8 = numeric(length(neotropical_grid)), bio_12 = numeric(length(neotropical_grid)),
bio_15 = numeric(length(neotropical_grid)), bio_18 = numeric(length(neotropical_grid)))
bio_1_ext <- raster::extract(bioClimVars$bio1, neotropical_grid) # bio_1_ext is a list of cells that contains all values for each cell, for each predictor
bio_4_ext <- raster::extract(bioClimVars$bio4, neotropical_grid)
bio_8_ext <- raster::extract(bioClimVars$bio8, neotropical_grid)
bio_12_ext <- raster::extract(bioClimVars$bio12, neotropical_grid)
bio_15_ext <- raster::extract(bioClimVars$bio15, neotropical_grid)
bio_18_ext <- raster::extract(bioClimVars$bio18, neotropical_grid)
myExpl$bio_1 <- unlist(lapply(bio_1_ext, function(x) if (!is.null(x)) mean(x,
na.rm = TRUE) else NA))
myExpl$bio_4 <- unlist(lapply(bio_4_ext, function(x) if (!is.null(x)) mean(x,
na.rm = TRUE) else NA))
myExpl$bio_8 <- unlist(lapply(bio_8_ext, function(x) if (!is.null(x)) mean(x,
na.rm = TRUE) else NA))
myExpl$bio_12 <- unlist(lapply(bio_12_ext, function(x) if (!is.null(x)) mean(x,
na.rm = TRUE) else NA))
myExpl$bio_15 <- unlist(lapply(bio_15_ext, function(x) if (!is.null(x)) mean(x,
na.rm = TRUE) else NA))
myExpl$bio_18 <- unlist(lapply(bio_18_ext, function(x) if (!is.null(x)) mean(x,
na.rm = TRUE) else NA))
write.table(myExpl, "data/matrices/NT_EnvVar_5.txt")
# Let's take a look at our table:
head(myExpl)Importing species data
Using expert drawn maps
One way of acquiring species distribution information is from expert-drawn maps, such as the ones from the IUCN Red List for Threatened Species database. These maps Tare usually done used multiple methods: only on occurrence data (e.g. a polygon around known occurrences), whereas others incorporate multiple degrees of knowledge, such as habitat requirements or elevational limits of the species, in essence using an informal distribution modelling approach.
Richness maps derived from expert-drawn range maps may overestimate local richness (‘error of commission’), in relation to point-to-grid richness maps. This because they are generally drawn to include all areas where a species is known to occur without excluding areas in between where the species may not exist. They tend to map the ‘extent of occurrence’ of species that includes the, perhaps much smaller, âarea of occupancyâ (Loislle et al., 2003; Habib et al., 2004; Hurlbert & White, 2005; Graham & Hijmans, 2006).
We advise caution when using them.
# Load your species map file
speciesGeoDist <- readShapePoly("data/shapefiles/NT_TERRESTRIAL_MAMMALS_subset.shp",
proj4string = crs.geo)
plot(speciesGeoDist)Create a presence-absence matrix of species’ geographic ranges within your polygon grid shapefile. For this, we will use the function lets.presab.grid from the letsR package.
crs.geo <- CRS("+proj=longlat +ellps=WGS84 + datum=WGS84") # geographical, datum WGS84
proj4string(neotropical_grid) <- crs.geo # define projection system of grid data
proj4string(speciesGeoDist) <- crs.geo # and of our bioclimatic rasters
abspres.matrix <- lets.presab.grid(shapes = speciesGeoDist, grid = neotropical_grid,
sample.unit = "ID", remove.sp = TRUE, presence = NULL, origin = NULL, seasonal = NULL)
# Export your presence-absence matrix
write.table(abspres.matrix$PAM, "data/matrices/NT_PAM_5.txt")You can now visualize a map of your species richness within your polygon grid:
richnessCalc <- rowSums(abspres.matrix$PAM) + 1
colPalette <- colorRampPalette(c("#fff5f0", "#fb6a4a", "#67000d"))
colors <- c("white", colPalette(max(richnessCalc)))
plot(abspres.matrix$grid, border = "gray40", col = colors[richnessCalc])
map(add = TRUE)We also need the coordinates of our centroids to proceed with the data analysis:
# Extract coordinates from the cell centroids
myRespCoord <- as.data.frame(coordinates(abspres.matrix$grid))
colnames(myRespCoord) <- c("Longitude_X", "Latitude_Y")
# Export it for the future
write.table(myRespCoord, "data/matrices/NT_coords_5.txt")Using occurrence records
We can also use occurrence records available online, or collected in the field. Here, we will use information from the Global Biodiversity Information Facility and import it using the function gbif:
D_youngi.GBIF <- gbif("Diaemus", "youngi")# get data frame with spatial coordinates (points)
Dyoungi.Occ <- subset(D_youngi.GBIF, select = c("country", "lat", "lon"))
head(Dyoungi.Occ) # a simple data frame with coordinates country lat lon
1 Nicaragua 11.11199 -85.76016
2 Trinidad and Tobago 10.54603 -61.48618
3 Suriname 1.99449 -56.09211
4 Suriname 1.99000 -56.09000
5 Suriname 1.99400 -56.09200
6 Suriname NA NA
Issues with occurrence points
Occurrence data from museum and herbarium collections are very valuable for mapping biodiversity patterns, and for species distribution modelling. But these datasets contain a lot of errors and are susceptible to many issues relating data quality. Many of these errors are extensively described in “Biogeo: an R package for assessing and improving data quality of occurrence record datasets”, from Robertson et al., 2016
Let’s take a look at a few of them:
| Error | Probable.Reason |
|---|---|
| Points plotted at zero degrees latitude and longitude. | No coordinates were available in the original dataset but values of zero assigned to the coordinates |
| Points in sea for terrestrial species or on land for aquatic species, obvious geographical outliers. | Transposed latitude and longitude coordinates; incorrect sign on the decimal degrees of the latitude or longitude coordinate; degrees and minutes were transposed before the coordinate was converted to decimal degrees; imprecise locality description used to assign coordinates; the specimen was incorrectly identified by the collector or the incorrect name was applied to the species when the data were digitized. |
| Point in sea but close to coast for terrestrial species, or on land but close to coast for marine species. | Low precision coordinates e.g. only degrees were available or the data were originally collected on a coarse-scale grid. Imprecise locality description used to assign coordinates. |
| Point plotted along the prime meridian or equator. | Missing coordinate for latitude or for longitude that was incorrectly assigned a value of zero. |
| Country name given in the record does not correspond with country where point is plotted. | Likely to be the same errors as for b) above. |
| Elevation given in the record does not correspond with elevation obtained from a digital elevation model where point is plotted. | Likely to be the same errors as for b) above, or the spatial resolution of the digital elevation model is too coarse. |
One way of working around these issues is by using the package biogeo, which has many functions for quality assessment and coordinate conversion.
Data cleaning
Let’s take a look at our species matrix. What is going when we look at the sum of the columns (i.e. number of presences)?
# Load our species data
DataSpecies <- read.table("data/matrices/NT_PAM_5.txt", header = TRUE)
colSums(DataSpecies) Panthera.onca Pecari.tajacu
50 55
Peromyscus.leucopus Phyllostomus.latifolius
3 10
Philander.opossum Phyllomys.nigrispinus
39 3
Peromyscus.mexicanus Phyllomys.dasythrix
3 2
Oxymycterus.amazonicus Oxymycterus.hispidus
8 5
Peropteryx.macrotis Pattonomys.carrikeri
47 1
Oxymycterus.josei Peropteryx.pallidoptera
1 5
Oxymycterus.hiska Peromyscus.aztecus
3 3
Peromyscus.beatae Philander.deltae
3 1
Phylloderma.stenops Peromyscus.furvus
43 1
Oxymycterus.rufus Peromyscus.grandis
7 2
Phaenomys.ferrugineus Peropteryx.kappleri
1 44
Phyllomys.blainvillii Phyllomys.brasiliensis
3 2
Peromyscus.levipes Philander.mondolfii
1 5
Philander.frenatus Phyllomys.mantiqueirensis
7 1
Phyllostomus.elongatus Oxymycterus.nasutus
41 5
Peromyscus.hylocetes Oxymycterus.hucucha
1 2
Oxymycterus.roberti Oxymycterus.dasytrichus
7 3
Oxymycterus.angularis Phyllomys.medius
3 3
Peromyscus.megalops Pattonomys.occasius
2 3
Phyllostomus.hastatus Phyllomys.kerri
44 1
Oxymycterus.quaestor Peropteryx.leucoptera
9 28
Phyllomys.thomasi Peromyscus.yucatanicus
1 2
Ozotoceros.bezoarticus Peromyscus.gratus
21 1
Philander.andersoni Pearsonomys.annectens
12 2
Oxymycterus.delator Philander.mcilhennyi
5 4
Peromyscus.maniculatus Peromyscus.gymnotis
1 1
Perognathus.flavus Oxymycterus.paramensis
1 7
Oxymycterus.inca Phyllostomus.discolor
5 46
Peromyscus.difficilis Peromyscus.melanotis
1 1
Phyllomys.lamarum Pattonomys.semivillosus
3 1
Phyllomys.lundi Peromyscus.stirtoni
1 2
Phyllomys.pattoni Phyllomys.sulinus
4 3
Peromyscus.melanophrys Oxymycterus.caparoae
2 2
We can work on that by removing species that occur in less than four sites.
# Remove all columns that have less than 4 presences
to.remove <- colnames(DataSpecies)[colSums(DataSpecies) <= 4]
`%ni%` <- Negate(`%in%`)
DataSpecies <- subset(DataSpecies, select = names(DataSpecies) %ni% to.remove)Sometimes, we also have problems with the separators between species names. We can use the gsub() function to solve this type of issue.
# Replace '.' per '.'
names(DataSpecies) <- gsub(x = names(DataSpecies), pattern = "\\.", replacement = ".")
write.table(DataSpecies, "data/matrices/NT_PAM_5.txt")Species distribution modelling
Initialising and formatting direct input data
First, input data needs to be rearranged for usage in biomod2 using BIOMOD_FormatingData(). Its most important arguments are:
resp.varcontains species data (for a single species) in binary format (ones for presences, zeros for true absences and NA for indeterminated) that will be used to build the species distribution models. It may be a vector or a spatial points object.expl.varcontains a matrix, data.frame, SpatialPointsDataFrame or RasterStack containing your explanatory variables.resp.xytwo columns matrix containing the X and Y coordinates ofresp.var(only consider if resp.var is a vector)resp.namecontains the response variable name (species).
# Recall your variables again:
## Load prepared species data
myResp <- read.table("data/matrices/NT_PAM_5.txt", header = TRUE)
## Load environmental variables
myExpl <- read.table("data/matrices/NT_EnvVar_5.txt", header = TRUE)
# Define the species of interest
myRespName <- names(DataSpecies)[1]
# Load coordinates
myRespCoord <- read.table("data/matrices/NT_coords_5.txt", header = TRUE)myBiomodData <- BIOMOD_FormatingData(resp.var = as.data.frame(DataSpecies[,myRespName]),
expl.var = myExpl,
resp.xy = as.data.frame(myRespCoord),
resp.name = myRespName,
#PA.nb.rep = 2,
#PA.nb.absences = 200,
#PA.strategy = 'random',
na.rm = TRUE)
-=-=-=-=-=-=-=-=-=-=-= Panthera.onca Data Formating -=-=-=-=-=-=-=-=-=-=-=
> No pseudo absences selection !
! No data has been set aside for modeling evaluation
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= Done -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Now, you may check if your data is correctly formatted by printing and plotting the myBiomodData object.
myBiomodData
-=-=-=-=-=-=-=-=-=-=-=-=-= 'BIOMOD.formated.data' -=-=-=-=-=-=-=-=-=-=-=-=-=
sp.name = Panthera.onca
50 presences, 13 true absences and 0 undifined points in dataset
6 explanatory variables
bio_1 bio_4 bio_8 bio_12
Min. : 54.06 Min. : 347.3 Min. : 32.76 Min. : 128.6
1st Qu.:182.15 1st Qu.: 521.5 1st Qu.:211.54 1st Qu.:1044.7
Median :236.43 Median :1124.6 Median :247.51 Median :1456.8
Mean :211.69 Mean :1907.0 Mean :217.99 Mean :1545.8
3rd Qu.:256.98 3rd Qu.:3183.6 3rd Qu.:255.88 3rd Qu.:2062.3
Max. :269.00 Max. :5717.2 Max. :271.96 Max. :3146.0
bio_15 bio_18
Min. : 12.76 Min. : 35.97
1st Qu.: 38.23 1st Qu.:207.82
Median : 58.85 Median :362.02
Mean : 56.49 Mean :345.60
3rd Qu.: 73.60 3rd Qu.:458.40
Max. :120.75 Max. :669.79
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
plot(myBiomodData)Building models
Defining algorithm parameters using default options
We are ready for running our set of models on our species. This step is the core of the modeling procedure.
First, we define the of different options for each modeling technique using the function BIOMOD_ModelingOptions(). If we do not change its parameters, it will take the default values (run Print_Default_ModelingOptions() to see them) defined within biomod2.
For example, you could change the parameters of the GLM function to compute quadratic a GLM and select best model with ‘BIC’ criterium:
myBiomodOptions <- BIOMOD_ModelingOptions(GLM = list(type = "quadratic", interaction.level = 0,
myFormula = NULL, test = "BIC", family = "binomial", control = glm.control(epsilon = 1e-08,
maxit = 1000, trace = FALSE)))You could also set your own GLM formula.
# you can prefer to establish your own GLM formula
myBiomodOptions <- BIOMOD_ModelingOptions(
GLM = list( myFormula = formula("Panthera.onca ~ bio3 + log(bio1) + poly(bio2,2) + bio4 + bio1:bio4")But, for the sake of simplicity, we will keep all options default.
myBiomodOption <- BIOMOD_ModelingOptions()
# The created object is then given to BIOMOD_Modeling in the next step.Computing models
Now, we proceed to running the set of models on our species. We may choose between 12 difeerent algorithms (‘GLM’, ‘GBM’, ‘GAM’, ‘CTA’, ‘ANN’, ‘SRE’, ‘FDA’, ‘MARS’, ‘RF’, ‘MAXENT.Phillips’ and ‘MAXENT.Tsuruoka’).
As we do not have evaluation data, we will make 3-fold cross-validation (number controlled by NbRunEval argument) of our models by randomly splitting our data set into 2 subsets DataSplit.
Take a look below:
myBiomodModelOut <- BIOMOD_Modeling(
# BIOMOD_FormatingData()-class object
data = myBiomodData,
# The set of models to be calibrated on the data
models = c('GBM', 'GLM', 'GAM', 'MAXENT.Tsuruoka', 'RF', 'MAXENT.Phillips'),
# BIOMOD.models.options object
models.options = myBiomodOption,
# Number of evaluation runs
NbRunEval=3,
# % of data used to calibrate the models, the remaining part will be used for testing
DataSplit=70,
# If this argument is kept to NULL, each observation (presence or absence) has the same weight (independent of the number of presences and absences).
Prevalence=0.5,
# Number of permutations to estimate variable importance
VarImport=5,
# Types of evaluations methods to be used. The complete list is within ?BIOMOD_Modeling()
models.eval.meth = c('TSS','ROC'),
# Keep all results and outputs on hard drive or not. Keep it always TRUE
SaveObj = TRUE,
# If true, all model predictions will be scaled with a binomial GLM
rescal.all.models = FALSE,
# If true, models calibrated and evaluated with the whole dataset are done.
# Building models with all information available may be usefull in some particular
# cases (i.e. rare species with few presences points).
do.full.models = FALSE,
# character, the ID (=name) of modeling procedure. A random number by default
modeling.id = paste(myRespName,"CurrentClim",sep="")) # We call it 'CurrentClim' because the bioclimatic variables we are working with are from the present climate.
Loading required library...
Checking Models arguments...
! java software seems not be corectly installed
> MAXENT.Phillips modelling was switched off!
Creating suitable Workdir...
> Automatic weights creation to rise a 0.5 prevalence
-=-=-=-=-=-=-=-=-=-=-= Panthera.onca Modeling Summary -=-=-=-=-=-=-=-=-=-=-=
6 environmental variables ( bio_1 bio_4 bio_8 bio_12 bio_15 bio_18 )
Number of evaluation repetitions : 3
Models selected : GBM GLM GAM MAXENT.Tsuruoka RF
Total number of model runs : 15
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
-=-=-=- Run : Panthera.onca_AllData
-=-=-=--=-=-=- Panthera.onca_AllData_RUN1
Model=Generalised Boosting Regression
2500 maximum different trees and 3 Fold Cross-Validation
Evaluating Model stuff...
Evaluating Predictor Contributions...
Model=GLM ( quadratic with no interaction )
Stepwise procedure using AIC criteria
selected formula : Panthera.onca ~ bio_4 + I(bio_18^2) + bio_15 + I(bio_12^2)
<environment: 0x0000000023033158>
Evaluating Model stuff...
Evaluating Predictor Contributions...
Model=GAM
GAM_mgcv algorithm chosen
Automatic formula generation...
> GAM (mgcv) modelling...
*** inherits(g.pred,'try-error')
! Note : Panthera.onca_AllData_RUN1_GAM failed!
Model evaluation
Extracting evaluation scores
We are able now to capture our model evaluations using the get_evaluations function. In this tutorial, we have focused on two evaluations methods: Receiving Operator Curves and True Skill Statistic.
# get all models evaluation
myBiomodModelEval <- get_evaluations(myBiomodModelOut) #contains model evaluation scores
# print the dimnames of this object
dimnames(myBiomodModelEval)[[1]]
[1] "TSS" "ROC"
[[2]]
[1] "Testing.data" "Cutoff" "Sensitivity" "Specificity"
[[3]]
[1] "GBM" "GLM" "GAM" "MAXENT.Tsuruoka"
[5] "RF"
[[4]]
[1] "RUN1" "RUN2" "RUN3"
[[5]]
Panthera.onca_AllData
"AllData"
hist(myBiomodModelEval)write.table(myBiomodModelEval, file = "data/results/EvalAll_Model_Eval")Plotting evaluation scores
We can also plot evaluation scores for sensitivity and specifity for all models.
# print the Sensitivity of all models --> TSS and ROC scores for SENSITIVITY
mySensitivity_All_Models <- myBiomodModelEval[, "Sensitivity", , , ]
mySensitivity_All_Models
mySensAll <- as.data.frame(t(as.data.frame(mySensitivity_All_Models)))
MySensHist.TSS <- ggplot(mySensAll, aes(x = TSS)) + geom_histogram(aes(y = ..density..),
binwidth = density(mySensAll$TSS)$bw) + geom_density(fill = "red", alpha = 0.2)
MySensHist.ROC <- ggplot(mySensAll, aes(x = ROC)) + geom_histogram(aes(y = ..density..),
binwidth = density(mySensAll$ROC)$bw) + geom_density(fill = "red", alpha = 0.2)
MySensHist.TSS
MySensHist.ROC
# save results
write.table(mySensitivity_All_Models, file = "data/results/Sensitivity_All_Models"), , RUN1
GBM GLM GAM MAXENT.Tsuruoka RF
TSS 100 100 NA 75 100
ROC 100 100 NA 75 100
, , RUN2
GBM GLM GAM MAXENT.Tsuruoka RF
TSS 100 50 NA 75 100
ROC 100 50 NA 75 100
, , RUN3
GBM GLM GAM MAXENT.Tsuruoka RF
TSS 100 100 NA 100 100
ROC 100 100 NA 100 100
# Proportion of PRESENCES correctly predicted (true positive rate)
sensitivity <- read.table("data/results/Sensitivity_All_Models")
summary(sensitivity) GBM.RUN1 GLM.RUN1 GAM.RUN1 MAXENT.Tsuruoka.RUN1
Min. :100 Min. :93.33 Min. :86.67 Min. :100
1st Qu.:100 1st Qu.:93.33 1st Qu.:86.67 1st Qu.:100
Median :100 Median :93.33 Median :86.67 Median :100
Mean :100 Mean :93.33 Mean :86.67 Mean :100
3rd Qu.:100 3rd Qu.:93.33 3rd Qu.:86.67 3rd Qu.:100
RF.RUN1 GBM.RUN2 GLM.RUN2 GAM.RUN2
Min. :100 Min. :100 Min. :100 Min. :100
1st Qu.:100 1st Qu.:100 1st Qu.:100 1st Qu.:100
Median :100 Median :100 Median :100 Median :100
Mean :100 Mean :100 Mean :100 Mean :100
3rd Qu.:100 3rd Qu.:100 3rd Qu.:100 3rd Qu.:100
MAXENT.Tsuruoka.RUN2 RF.RUN2 GBM.RUN3 GLM.RUN3
Min. :100 Min. :100 Min. :80 Min. :100
1st Qu.:100 1st Qu.:100 1st Qu.:80 1st Qu.:100
Median :100 Median :100 Median :80 Median :100
Mean :100 Mean :100 Mean :80 Mean :100
3rd Qu.:100 3rd Qu.:100 3rd Qu.:80 3rd Qu.:100
GAM.RUN3 MAXENT.Tsuruoka.RUN3 RF.RUN3
Min. :93.33 Min. :93.33 Min. :86.67
1st Qu.:93.33 1st Qu.:93.33 1st Qu.:86.67
Median :93.33 Median :93.33 Median :86.67
Mean :93.33 Mean :93.33 Mean :86.67
3rd Qu.:93.33 3rd Qu.:93.33 3rd Qu.:86.67
[ reached getOption("max.print") -- omitted 1 row ]
str(sensitivity)'data.frame': 2 obs. of 15 variables:
$ GBM.RUN1 : int 100 100
$ GLM.RUN1 : num 93.3 93.3
$ GAM.RUN1 : num 86.7 86.7
$ MAXENT.Tsuruoka.RUN1: int 100 100
$ RF.RUN1 : int 100 100
$ GBM.RUN2 : int 100 100
$ GLM.RUN2 : int 100 100
$ GAM.RUN2 : int 100 100
$ MAXENT.Tsuruoka.RUN2: int 100 100
$ RF.RUN2 : int 100 100
$ GBM.RUN3 : int 80 80
$ GLM.RUN3 : int 100 100
$ GAM.RUN3 : num 93.3 93.3
$ MAXENT.Tsuruoka.RUN3: num 93.3 93.3
$ RF.RUN3 : num 86.7 86.7
sensitivity #table of sensitivity by TSS or ROC per run per model GBM.RUN1 GLM.RUN1 GAM.RUN1 MAXENT.Tsuruoka.RUN1 RF.RUN1 GBM.RUN2
TSS 100 93.333 86.667 100 100 100
ROC 100 93.333 86.667 100 100 100
GLM.RUN2 GAM.RUN2 MAXENT.Tsuruoka.RUN2 RF.RUN2 GBM.RUN3 GLM.RUN3
TSS 100 100 100 100 80 100
ROC 100 100 100 100 80 100
GAM.RUN3 MAXENT.Tsuruoka.RUN3 RF.RUN3
TSS 93.333 93.333 86.667
ROC 93.333 93.333 86.667
sensTSS.plot <- ggplot(sensData, aes(x = model1, y = TSS, fill = model1)) +
geom_boxplot() + guides(fill = FALSE) + ggtitle("Sensitivity using TSS scores") +
theme(plot.subtitle = element_text(vjust = 1), plot.caption = element_text(vjust = 1),
panel.background = element_rect(fill = NA)) + labs(x = "Model Type",
y = "TSS (%)")
sensTSS.plot <- ggplotly(sensTSS.plot)
sensROC.plot <- ggplot(sensData, aes(x = model2, y = ROC, fill = model1)) +
geom_boxplot() + guides(fill = FALSE) + ggtitle("Sensitivity using ROC scores") +
theme(plot.subtitle = element_text(vjust = 1), plot.caption = element_text(vjust = 1),
panel.background = element_rect(fill = NA)) + labs(x = "Model Type",
y = "ROC (%)")
sensROC.plot <- ggplotly(sensROC.plot)
sensTSS.plotsensROC.plotDiscussion: How do the models compare by their sensitivity scores?
##### plot SPECIFICITY #####
# Proportion of ABSENCES correctly predicted (true negative rate)
specificity <- read.table("data/results/Specificity_All_Models")
summary(specificity) GBM.RUN1 GLM.RUN1 GAM.RUN1 MAXENT.Tsuruoka.RUN1
Min. :100 Min. :100 Mode:logical Min. :75
1st Qu.:100 1st Qu.:100 NA's:2 1st Qu.:75
Median :100 Median :100 Median :75
Mean :100 Mean :100 Mean :75
3rd Qu.:100 3rd Qu.:100 3rd Qu.:75
RF.RUN1 GBM.RUN2 GLM.RUN2 GAM.RUN2
Min. :100 Min. :100 Min. :50 Mode:logical
1st Qu.:100 1st Qu.:100 1st Qu.:50 NA's:2
Median :100 Median :100 Median :50
Mean :100 Mean :100 Mean :50
3rd Qu.:100 3rd Qu.:100 3rd Qu.:50
MAXENT.Tsuruoka.RUN2 RF.RUN2 GBM.RUN3 GLM.RUN3
Min. :75 Min. :100 Min. :100 Min. :100
1st Qu.:75 1st Qu.:100 1st Qu.:100 1st Qu.:100
Median :75 Median :100 Median :100 Median :100
Mean :75 Mean :100 Mean :100 Mean :100
3rd Qu.:75 3rd Qu.:100 3rd Qu.:100 3rd Qu.:100
GAM.RUN3 MAXENT.Tsuruoka.RUN3 RF.RUN3
Mode:logical Min. :100 Min. :100
NA's:2 1st Qu.:100 1st Qu.:100
Median :100 Median :100
Mean :100 Mean :100
3rd Qu.:100 3rd Qu.:100
[ reached getOption("max.print") -- omitted 1 row ]
str(specificity)'data.frame': 2 obs. of 15 variables:
$ GBM.RUN1 : int 100 100
$ GLM.RUN1 : int 100 100
$ GAM.RUN1 : logi NA NA
$ MAXENT.Tsuruoka.RUN1: int 75 75
$ RF.RUN1 : int 100 100
$ GBM.RUN2 : int 100 100
$ GLM.RUN2 : int 50 50
$ GAM.RUN2 : logi NA NA
$ MAXENT.Tsuruoka.RUN2: int 75 75
$ RF.RUN2 : int 100 100
$ GBM.RUN3 : int 100 100
$ GLM.RUN3 : int 100 100
$ GAM.RUN3 : logi NA NA
$ MAXENT.Tsuruoka.RUN3: int 100 100
$ RF.RUN3 : int 100 100
specificity #table of specificity by TSS or ROC per run per model GBM.RUN1 GLM.RUN1 GAM.RUN1 MAXENT.Tsuruoka.RUN1 RF.RUN1 GBM.RUN2
TSS 100 100 NA 75 100 100
ROC 100 100 NA 75 100 100
GLM.RUN2 GAM.RUN2 MAXENT.Tsuruoka.RUN2 RF.RUN2 GBM.RUN3 GLM.RUN3
TSS 50 NA 75 100 100 100
ROC 50 NA 75 100 100 100
GAM.RUN3 MAXENT.Tsuruoka.RUN3 RF.RUN3
TSS NA 100 100
ROC NA 100 100
specTSS <- specificity[1, ]
specTran <- t(specTSS)
modelNames <- rownames(specTran)
model1 <- substr(modelNames, 1, 3)
specPlot1 <- data.frame(specTran, model1)
specROC <- specificity[2, ]
specTran <- t(specROC)
modelNames <- rownames(specTran)
model2 <- substr(modelNames, 1, 3)
specPlot2 <- data.frame(specTran, model2)
specData <- cbind(specPlot1, specPlot2)Let’s take a look at our plots for specificity:
specTSS.plot <- ggplot(specData, aes(x = model1, y = TSS, fill = model1)) +
geom_boxplot() + guides(fill = FALSE) + ggtitle("Specificity using TSS scores") +
theme(plot.subtitle = element_text(vjust = 1), plot.caption = element_text(vjust = 1),
panel.background = element_rect(fill = NA)) + labs(x = "Model Type",
y = "TSS (%)")
specROC.plot <- ggplot(specData, aes(x = model2, y = ROC, fill = model1)) +
geom_boxplot() + guides(fill = FALSE) + ggtitle("Specificity using ROC scores") +
theme(plot.subtitle = element_text(vjust = 1), plot.caption = element_text(vjust = 1),
panel.background = element_rect(fill = NA)) + labs(x = "Model Type",
y = "ROC (%)")
# Make them interactive!
specTSS.plot <- ggplotly(specTSS.plot)
specROC.plot <- ggplotly(specROC.plot)
# Plot them
specTSS.plotspecROC.plotWe can also print the variable importance for all models
# print variable importances
get_variables_importance(myBiomodModelOut), , RUN1, AllData
GBM GLM GAM MAXENT.Tsuruoka RF
bio_1 0.002 0.000 NA 0.407 0.052
bio_4 0.106 0.095 NA 0.640 0.136
bio_8 0.039 0.000 NA 0.334 0.028
bio_12 0.012 0.468 NA 0.170 0.014
bio_15 0.018 0.648 NA 0.448 0.034
bio_18 0.078 0.650 NA 0.843 0.014
, , RUN2, AllData
GBM GLM GAM MAXENT.Tsuruoka RF
bio_1 0.005 0.000 NA 0.348 0.035
bio_4 0.262 0.204 NA 0.736 0.317
bio_8 0.006 0.000 NA 0.496 0.011
bio_12 0.010 0.439 NA 0.350 0.013
bio_15 0.019 0.648 NA 0.472 0.015
bio_18 0.066 0.706 NA 0.513 0.021
, , RUN3, AllData
GBM GLM GAM MAXENT.Tsuruoka RF
bio_1 0.013 0.000 NA 0.358 0.048
bio_4 0.059 0.412 NA 0.694 0.169
bio_8 0.011 0.000 NA 0.851 0.017
This is another common way of saving these variables in disk for later use.
### save models evaluation scores and variables importance on hard drive
capture.output(get_evaluations(myBiomodModelOut), file = file.path(myRespName,
paste(myRespName, "_formal_models_eval.txt", sep = "")))
capture.output(get_variables_importance(myBiomodModelOut), file = file.path(myRespName,
paste(myRespName, "_formal_models_var_imp.txt", sep = "")))Finally, we can also see a few response plots
# 4. Plot response curves
# 4.1 Load the models for which we want to extract the predicted response
# curves
myGLMs <- BIOMOD_LoadModels(myBiomodModelOut, models = c("GLM"))
# 4.2 plot 2D response plots
myRespPlot2D <- response.plot2(models = myGLMs, Data = get_formal_data(myBiomodModelOut,
"expl.var"), show.variables = get_formal_data(myBiomodModelOut, "expl.var.names"),
do.bivariate = FALSE, fixed.var.metric = "median", col = c("blue", "red"),
legend = TRUE, data_species = get_formal_data(myBiomodModelOut, "resp.var"))
# 4.2 plot 3D response plots here only for a lone model
myRespPlot3D <- response.plot2(models = myGLMs[1], Data = get_formal_data(myBiomodModelOut,
"expl.var"), show.variables = get_formal_data(myBiomodModelOut, "expl.var.names"),
do.bivariate = TRUE, fixed.var.metric = "median", data_species = get_formal_data(myBiomodModelOut,
"resp.var"), display_title = FALSE)### all the values used to produce this plot are stored into the returned
### object you can redo plots by yourself and customised themProjecting your outputs
### Make projections on current variable
myBiomodProj <- BIOMOD_Projection(modeling.output = myBiomodModelOut, new.env = myExpl,
xy.new.env = myRespCoord, proj.name = "current", selected.models = "all",
binary.meth = "TSS", compress = TRUE, clamping.mask = F, output.format = ".RData")
-=-=-=-=-=-=-=-=-=-=-=-=-= Do Models Projections -=-=-=-=-=-=-=-=-=-=-=-=-=
> Building clamping mask
*** in setMethod('BinaryTransformation', signature(data='numeric')
*** in setMethod('BinaryTransformation', signature(data='data.frame')
*** in setMethod('BinaryTransformation', signature(data='numeric')
*** in setMethod('BinaryTransformation', signature(data='data.frame')
*** in setMethod('BinaryTransformation', signature(data='numeric')
*** in setMethod('BinaryTransformation', signature(data='data.frame')
*** in setMethod('BinaryTransformation', signature(data='numeric')
*** in setMethod('BinaryTransformation', signature(data='data.frame')
*** in setMethod('BinaryTransformation', signature(data='numeric')
*** in setMethod('BinaryTransformation', signature(data='data.frame')
*** in setMethod('BinaryTransformation', signature(data='numeric')
*** in setMethod('BinaryTransformation', signature(data='data.frame')
*** in setMethod('BinaryTransformation', signature(data='numeric')
*** in setMethod('BinaryTransformation', signature(data='data.frame')
*** in setMethod('BinaryTransformation', signature(data='numeric')
*** in setMethod('BinaryTransformation', signature(data='data.frame')
*** in setMethod('BinaryTransformation', signature(data='numeric')
*** in setMethod('BinaryTransformation', signature(data='data.frame')
*** in setMethod('BinaryTransformation', signature(data='numeric')
*** in setMethod('BinaryTransformation', signature(data='data.frame')
*** in setMethod('BinaryTransformation', signature(data='numeric')
*** in setMethod('BinaryTransformation', signature(data='data.frame')
*** in setMethod('BinaryTransformation', signature(data='numeric')
*** in setMethod('BinaryTransformation', signature(data='data.frame')
> Projecting Panthera.onca_AllData_RUN1_GBM ...
> Projecting Panthera.onca_AllData_RUN1_GLM ...
# Plot your predictions like this
plot(myBiomodProj)
myCurrentProj <- get_predictions(myBiomodProj)
myCurrentProj, , RUN1, AllData
GBM GLM MAXENT.Tsuruoka RF
[1,] 859 1000 1000 938
[2,] 938 1000 1000 988
[3,] 949 1000 1000 998
[4,] 904 1000 1000 994
[5,] 905 1000 1000 992
[6,] 920 1000 1000 992
[7,] 959 1000 1000 1000
[8,] 865 1000 1000 924
[9,] 938 1000 1000 996
[10,] 937 1000 1000 968
[11,] 950 1000 1000 998
[12,] 884 1000 0 984
[13,] 898 1000 1000 990
[14,] 865 1000 1000 904
[15,] 937 1000 1000 858
[16,] 937 1000 1000 874
[17,] 942 1000 1000 1000
[18,] 908 1000 1000 992
[ reached getOption("max.print") -- omitted 45 row(s) and 2 matrix slice(s) ]
Ensemble of forecasting
One powerful feature from biomod2comes with BIOMOD_EnsembleModeling, which combines individual models to build an ensemble of forecast. In the following example, we exclude all models having a TSS score lower than 0.5.
-=-=-=-=-=-=-=-=-=-=-=-=-= Build Ensemble Models -=-=-=-=-=-=-=-=-=-=-=-=-=
! all models available will be included in ensemble.modeling
> Evaluation & Weighting methods summary :
TSS over 0.7
ROC over 0.5
> mergedAlgo_mergedRun_mergedData ensemble modeling
! Models projections for whole zonation required...
> Projecting Panthera.onca_AllData_RUN1_GBM ...
> Projecting Panthera.onca_AllData_RUN1_GLM ...
> Projecting Panthera.onca_AllData_RUN1_RF ...
> Projecting Panthera.onca_AllData_RUN2_GBM ...
-=-=-=-=-=-=-=-=-=-=-= Do Ensemble Models Projections -=-=-=-=-=-=-=-=-=-=-=
> Projecting Panthera.onca_EMmeanByTSS_mergedAlgo_mergedRun_mergedData ...
*** here!!
'data.frame': 63 obs. of 12 variables:
$ Panthera.onca_AllData_RUN1_GBM : num 859 938 949 904 905 920 959 865 938 937 ...
$ Panthera.onca_AllData_RUN1_GLM : num 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 ...
$ Panthera.onca_AllData_RUN1_RF : num 938 988 998 994 992 992 1000 924 996 968 ...
$ Panthera.onca_AllData_RUN2_GBM : num 911 907 932 937 936 943 962 905 950 941 ...
$ Panthera.onca_AllData_RUN2_MAXENT.Tsuruoka: num 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 ...
$ Panthera.onca_AllData_RUN2_RF : num 972 982 996 990 990 1000 1000 970 996 1000 ...
$ Panthera.onca_AllData_RUN3_GBM : num 875 939 949 936 937 942 958 917 947 944 ...
$ Panthera.onca_AllData_RUN3_GLM : num 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 ...
$ Panthera.onca_AllData_RUN3_MAXENT.Tsuruoka: num 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 ...
$ Panthera.onca_AllData_RUN3_RF : num 888 984 1000 994 990 998 1000 988 996 998 ...
$ Panthera.onca_AllData_RUN1_MAXENT.Tsuruoka: num 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 ...
$ Panthera.onca_AllData_RUN2_GLM : num 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 ...
NULL
***
Panthera.onca_AllData_RUN1_GBM Panthera.onca_AllData_RUN1_GLM Panthera.onca_AllData_RUN1_RF Panthera.onca_AllData_RUN2_GBM Panthera.onca_AllData_RUN2_MAXENT.Tsuruoka Panthera.onca_AllData_RUN2_RF Panthera.onca_AllData_RUN3_GBM Panthera.onca_AllData_RUN3_GLM Panthera.onca_AllData_RUN3_MAXENT.Tsuruoka Panthera.onca_AllData_RUN3_RF
*** here!!
> Projecting Panthera.onca_EMcvByTSS_mergedAlgo_mergedRun_mergedData ...
*** here!!
'data.frame': 63 obs. of 12 variables:
$ Panthera.onca_AllData_RUN1_GBM : num 859 938 949 904 905 920 959 865 938 937 ...
$ Panthera.onca_AllData_RUN1_GLM : num 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 ...
$ Panthera.onca_AllData_RUN1_RF : num 938 988 998 994 992 992 1000 924 996 968 ...
$ Panthera.onca_AllData_RUN2_GBM : num 911 907 932 937 936 943 962 905 950 941 ...
$ Panthera.onca_AllData_RUN2_MAXENT.Tsuruoka: num 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 ...
$ Panthera.onca_AllData_RUN2_RF : num 972 982 996 990 990 1000 1000 970 996 1000 ...
$ Panthera.onca_AllData_RUN3_GBM : num 875 939 949 936 937 942 958 917 947 944 ...
$ Panthera.onca_AllData_RUN3_GLM : num 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 ...
$ Panthera.onca_AllData_RUN3_MAXENT.Tsuruoka: num 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 ...
$ Panthera.onca_AllData_RUN3_RF : num 888 984 1000 994 990 998 1000 988 996 998 ...
$ Panthera.onca_AllData_RUN1_MAXENT.Tsuruoka: num 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 ...
$ Panthera.onca_AllData_RUN2_GLM : num 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 ...
NULL
***
Panthera.onca_AllData_RUN1_GBM Panthera.onca_AllData_RUN1_GLM Panthera.onca_AllData_RUN1_RF Panthera.onca_AllData_RUN2_GBM Panthera.onca_AllData_RUN2_MAXENT.Tsuruoka Panthera.onca_AllData_RUN2_RF Panthera.onca_AllData_RUN3_GBM Panthera.onca_AllData_RUN3_GLM Panthera.onca_AllData_RUN3_MAXENT.Tsuruoka Panthera.onca_AllData_RUN3_RF
*** here!!
> Projecting Panthera.onca_EMciInfByTSS_mergedAlgo_mergedRun_mergedData ...
*** here!!
'data.frame': 63 obs. of 12 variables:
$ Panthera.onca_AllData_RUN1_GBM : num 859 938 949 904 905 920 959 865 938 937 ...
$ Panthera.onca_AllData_RUN1_GLM : num 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 ...
$ Panthera.onca_AllData_RUN1_RF : num 938 988 998 994 992 992 1000 924 996 968 ...
$ Panthera.onca_AllData_RUN2_GBM : num 911 907 932 937 936 943 962 905 950 941 ...
$ Panthera.onca_AllData_RUN2_MAXENT.Tsuruoka: num 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 ...
$ Panthera.onca_AllData_RUN2_RF : num 972 982 996 990 990 1000 1000 970 996 1000 ...
$ Panthera.onca_AllData_RUN3_GBM : num 875 939 949 936 937 942 958 917 947 944 ...
$ Panthera.onca_AllData_RUN3_GLM : num 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 ...
$ Panthera.onca_AllData_RUN3_MAXENT.Tsuruoka: num 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 ...
$ Panthera.onca_AllData_RUN3_RF : num 888 984 1000 994 990 998 1000 988 996 998 ...
$ Panthera.onca_AllData_RUN1_MAXENT.Tsuruoka: num 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 ...
$ Panthera.onca_AllData_RUN2_GLM : num 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 ...
NULL
***
Panthera.onca_AllData_RUN1_GBM Panthera.onca_AllData_RUN1_GLM Panthera.onca_AllData_RUN1_RF Panthera.onca_AllData_RUN2_GBM Panthera.onca_AllData_RUN2_MAXENT.Tsuruoka Panthera.onca_AllData_RUN2_RF Panthera.onca_AllData_RUN3_GBM Panthera.onca_AllData_RUN3_GLM Panthera.onca_AllData_RUN3_MAXENT.Tsuruoka Panthera.onca_AllData_RUN3_RF
*** here!!
> Projecting Panthera.onca_EMciSupByTSS_mergedAlgo_mergedRun_mergedData ...
*** here!!
'data.frame': 63 obs. of 12 variables:
$ Panthera.onca_AllData_RUN1_GBM : num 859 938 949 904 905 920 959 865 938 937 ...
$ Panthera.onca_AllData_RUN1_GLM : num 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 ...
$ Panthera.onca_AllData_RUN1_RF : num 938 988 998 994 992 992 1000 924 996 968 ...
$ Panthera.onca_AllData_RUN2_GBM : num 911 907 932 937 936 943 962 905 950 941 ...
$ Panthera.onca_AllData_RUN2_MAXENT.Tsuruoka: num 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 ...
$ Panthera.onca_AllData_RUN2_RF : num 972 982 996 990 990 1000 1000 970 996 1000 ...
$ Panthera.onca_AllData_RUN3_GBM : num 875 939 949 936 937 942 958 917 947 944 ...
$ Panthera.onca_AllData_RUN3_GLM : num 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 ...
$ Panthera.onca_AllData_RUN3_MAXENT.Tsuruoka: num 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 ...
$ Panthera.onca_AllData_RUN3_RF : num 888 984 1000 994 990 998 1000 988 996 998 ...
$ Panthera.onca_AllData_RUN1_MAXENT.Tsuruoka: num 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 ...
$ Panthera.onca_AllData_RUN2_GLM : num 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 ...
NULL
***
Panthera.onca_AllData_RUN1_GBM Panthera.onca_AllData_RUN1_GLM Panthera.onca_AllData_RUN1_RF Panthera.onca_AllData_RUN2_GBM Panthera.onca_AllData_RUN2_MAXENT.Tsuruoka Panthera.onca_AllData_RUN2_RF Panthera.onca_AllData_RUN3_GBM Panthera.onca_AllData_RUN3_GLM Panthera.onca_AllData_RUN3_MAXENT.Tsuruoka Panthera.onca_AllData_RUN3_RF
*** here!!
> Projecting Panthera.onca_EMmedianByTSS_mergedAlgo_mergedRun_mergedData ...
*** here!!
'data.frame': 63 obs. of 12 variables:
$ Panthera.onca_AllData_RUN1_GBM : num 859 938 949 904 905 920 959 865 938 937 ...
$ Panthera.onca_AllData_RUN1_GLM : num 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 ...
$ Panthera.onca_AllData_RUN1_RF : num 938 988 998 994 992 992 1000 924 996 968 ...
$ Panthera.onca_AllData_RUN2_GBM : num 911 907 932 937 936 943 962 905 950 941 ...
$ Panthera.onca_AllData_RUN2_MAXENT.Tsuruoka: num 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 ...
$ Panthera.onca_AllData_RUN2_RF : num 972 982 996 990 990 1000 1000 970 996 1000 ...
$ Panthera.onca_AllData_RUN3_GBM : num 875 939 949 936 937 942 958 917 947 944 ...
$ Panthera.onca_AllData_RUN3_GLM : num 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 ...
$ Panthera.onca_AllData_RUN3_MAXENT.Tsuruoka: num 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 ...
$ Panthera.onca_AllData_RUN3_RF : num 888 984 1000 994 990 998 1000 988 996 998 ...
$ Panthera.onca_AllData_RUN1_MAXENT.Tsuruoka: num 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 ...
$ Panthera.onca_AllData_RUN2_GLM : num 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 ...
NULL
***
Panthera.onca_AllData_RUN1_GBM Panthera.onca_AllData_RUN1_GLM Panthera.onca_AllData_RUN1_RF Panthera.onca_AllData_RUN2_GBM Panthera.onca_AllData_RUN2_MAXENT.Tsuruoka Panthera.onca_AllData_RUN2_RF Panthera.onca_AllData_RUN3_GBM Panthera.onca_AllData_RUN3_GLM Panthera.onca_AllData_RUN3_MAXENT.Tsuruoka Panthera.onca_AllData_RUN3_RF
*** here!!
> Projecting Panthera.onca_EMcaByTSS_mergedAlgo_mergedRun_mergedData ...
*** here!!
'data.frame': 63 obs. of 12 variables:
$ Panthera.onca_AllData_RUN1_GBM : num 859 938 949 904 905 920 959 865 938 937 ...
$ Panthera.onca_AllData_RUN1_GLM : num 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 ...
$ Panthera.onca_AllData_RUN1_RF : num 938 988 998 994 992 992 1000 924 996 968 ...
$ Panthera.onca_AllData_RUN2_GBM : num 911 907 932 937 936 943 962 905 950 941 ...
$ Panthera.onca_AllData_RUN2_MAXENT.Tsuruoka: num 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 ...
$ Panthera.onca_AllData_RUN2_RF : num 972 982 996 990 990 1000 1000 970 996 1000 ...
$ Panthera.onca_AllData_RUN3_GBM : num 875 939 949 936 937 942 958 917 947 944 ...
$ Panthera.onca_AllData_RUN3_GLM : num 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 ...
$ Panthera.onca_AllData_RUN3_MAXENT.Tsuruoka: num 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 ...
$ Panthera.onca_AllData_RUN3_RF : num 888 984 1000 994 990 998 1000 988 996 998 ...
$ Panthera.onca_AllData_RUN1_MAXENT.Tsuruoka: num 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 ...
$ Panthera.onca_AllData_RUN2_GLM : num 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 ...
NULL
***
Panthera.onca_AllData_RUN1_GBM Panthera.onca_AllData_RUN1_GLM Panthera.onca_AllData_RUN1_RF Panthera.onca_AllData_RUN2_GBM Panthera.onca_AllData_RUN2_MAXENT.Tsuruoka Panthera.onca_AllData_RUN2_RF Panthera.onca_AllData_RUN3_GBM Panthera.onca_AllData_RUN3_GLM Panthera.onca_AllData_RUN3_MAXENT.Tsuruoka Panthera.onca_AllData_RUN3_RF
*** here!!
*** in setMethod('BinaryTransformation', signature(data='data.frame')
> Projecting Panthera.onca_EMwmeanByTSS_mergedAlgo_mergedRun_mergedData ...
*** here!!
'data.frame': 63 obs. of 12 variables:
$ Panthera.onca_AllData_RUN1_GBM : num 859 938 949 904 905 920 959 865 938 937 ...
$ Panthera.onca_AllData_RUN1_GLM : num 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 ...
$ Panthera.onca_AllData_RUN1_RF : num 938 988 998 994 992 992 1000 924 996 968 ...
$ Panthera.onca_AllData_RUN2_GBM : num 911 907 932 937 936 943 962 905 950 941 ...
$ Panthera.onca_AllData_RUN2_MAXENT.Tsuruoka: num 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 ...
$ Panthera.onca_AllData_RUN2_RF : num 972 982 996 990 990 1000 1000 970 996 1000 ...
$ Panthera.onca_AllData_RUN3_GBM : num 875 939 949 936 937 942 958 917 947 944 ...
$ Panthera.onca_AllData_RUN3_GLM : num 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 ...
$ Panthera.onca_AllData_RUN3_MAXENT.Tsuruoka: num 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 ...
$ Panthera.onca_AllData_RUN3_RF : num 888 984 1000 994 990 998 1000 988 996 998 ...
$ Panthera.onca_AllData_RUN1_MAXENT.Tsuruoka: num 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 ...
$ Panthera.onca_AllData_RUN2_GLM : num 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 ...
NULL
***
Panthera.onca_AllData_RUN1_GBM Panthera.onca_AllData_RUN1_GLM Panthera.onca_AllData_RUN1_RF Panthera.onca_AllData_RUN2_GBM Panthera.onca_AllData_RUN2_MAXENT.Tsuruoka Panthera.onca_AllData_RUN2_RF Panthera.onca_AllData_RUN3_GBM Panthera.onca_AllData_RUN3_GLM Panthera.onca_AllData_RUN3_MAXENT.Tsuruoka Panthera.onca_AllData_RUN3_RF
*** here!!
> Projecting Panthera.onca_EMmeanByROC_mergedAlgo_mergedRun_mergedData ...
*** here!!
'data.frame': 63 obs. of 12 variables:
$ Panthera.onca_AllData_RUN1_GBM : num 859 938 949 904 905 920 959 865 938 937 ...
$ Panthera.onca_AllData_RUN1_GLM : num 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 ...
$ Panthera.onca_AllData_RUN1_RF : num 938 988 998 994 992 992 1000 924 996 968 ...
$ Panthera.onca_AllData_RUN2_GBM : num 911 907 932 937 936 943 962 905 950 941 ...
$ Panthera.onca_AllData_RUN2_MAXENT.Tsuruoka: num 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 ...
$ Panthera.onca_AllData_RUN2_RF : num 972 982 996 990 990 1000 1000 970 996 1000 ...
$ Panthera.onca_AllData_RUN3_GBM : num 875 939 949 936 937 942 958 917 947 944 ...
$ Panthera.onca_AllData_RUN3_GLM : num 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 ...
$ Panthera.onca_AllData_RUN3_MAXENT.Tsuruoka: num 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 ...
$ Panthera.onca_AllData_RUN3_RF : num 888 984 1000 994 990 998 1000 988 996 998 ...
$ Panthera.onca_AllData_RUN1_MAXENT.Tsuruoka: num 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 ...
$ Panthera.onca_AllData_RUN2_GLM : num 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 ...
NULL
***
Panthera.onca_AllData_RUN1_GBM Panthera.onca_AllData_RUN1_GLM Panthera.onca_AllData_RUN1_MAXENT.Tsuruoka Panthera.onca_AllData_RUN1_RF Panthera.onca_AllData_RUN2_GBM Panthera.onca_AllData_RUN2_GLM Panthera.onca_AllData_RUN2_MAXENT.Tsuruoka Panthera.onca_AllData_RUN2_RF Panthera.onca_AllData_RUN3_GBM Panthera.onca_AllData_RUN3_GLM Panthera.onca_AllData_RUN3_MAXENT.Tsuruoka Panthera.onca_AllData_RUN3_RF
*** here!!
> Projecting Panthera.onca_EMcvByROC_mergedAlgo_mergedRun_mergedData ...
*** here!!
'data.frame': 63 obs. of 12 variables:
$ Panthera.onca_AllData_RUN1_GBM : num 859 938 949 904 905 920 959 865 938 937 ...
$ Panthera.onca_AllData_RUN1_GLM : num 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 ...
$ Panthera.onca_AllData_RUN1_RF : num 938 988 998 994 992 992 1000 924 996 968 ...
$ Panthera.onca_AllData_RUN2_GBM : num 911 907 932 937 936 943 962 905 950 941 ...
$ Panthera.onca_AllData_RUN2_MAXENT.Tsuruoka: num 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 ...
$ Panthera.onca_AllData_RUN2_RF : num 972 982 996 990 990 1000 1000 970 996 1000 ...
$ Panthera.onca_AllData_RUN3_GBM : num 875 939 949 936 937 942 958 917 947 944 ...
$ Panthera.onca_AllData_RUN3_GLM : num 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 ...
$ Panthera.onca_AllData_RUN3_MAXENT.Tsuruoka: num 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 ...
$ Panthera.onca_AllData_RUN3_RF : num 888 984 1000 994 990 998 1000 988 996 998 ...
$ Panthera.onca_AllData_RUN1_MAXENT.Tsuruoka: num 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 ...
$ Panthera.onca_AllData_RUN2_GLM : num 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 ...
NULL
***
Panthera.onca_AllData_RUN1_GBM Panthera.onca_AllData_RUN1_GLM Panthera.onca_AllData_RUN1_MAXENT.Tsuruoka Panthera.onca_AllData_RUN1_RF Panthera.onca_AllData_RUN2_GBM Panthera.onca_AllData_RUN2_GLM Panthera.onca_AllData_RUN2_MAXENT.Tsuruoka Panthera.onca_AllData_RUN2_RF Panthera.onca_AllData_RUN3_GBM Panthera.onca_AllData_RUN3_GLM Panthera.onca_AllData_RUN3_MAXENT.Tsuruoka Panthera.onca_AllData_RUN3_RF
*** here!!
> Projecting Panthera.onca_EMciInfByROC_mergedAlgo_mergedRun_mergedData ...
*** here!!
'data.frame': 63 obs. of 12 variables:
$ Panthera.onca_AllData_RUN1_GBM : num 859 938 949 904 905 920 959 865 938 937 ...
$ Panthera.onca_AllData_RUN1_GLM : num 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 ...
$ Panthera.onca_AllData_RUN1_RF : num 938 988 998 994 992 992 1000 924 996 968 ...
$ Panthera.onca_AllData_RUN2_GBM : num 911 907 932 937 936 943 962 905 950 941 ...
$ Panthera.onca_AllData_RUN2_MAXENT.Tsuruoka: num 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 ...
$ Panthera.onca_AllData_RUN2_RF : num 972 982 996 990 990 1000 1000 970 996 1000 ...
$ Panthera.onca_AllData_RUN3_GBM : num 875 939 949 936 937 942 958 917 947 944 ...
$ Panthera.onca_AllData_RUN3_GLM : num 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 ...
$ Panthera.onca_AllData_RUN3_MAXENT.Tsuruoka: num 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 ...
$ Panthera.onca_AllData_RUN3_RF : num 888 984 1000 994 990 998 1000 988 996 998 ...
$ Panthera.onca_AllData_RUN1_MAXENT.Tsuruoka: num 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 ...
$ Panthera.onca_AllData_RUN2_GLM : num 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 ...
NULL
***
Panthera.onca_AllData_RUN1_GBM Panthera.onca_AllData_RUN1_GLM Panthera.onca_AllData_RUN1_MAXENT.Tsuruoka Panthera.onca_AllData_RUN1_RF Panthera.onca_AllData_RUN2_GBM Panthera.onca_AllData_RUN2_GLM Panthera.onca_AllData_RUN2_MAXENT.Tsuruoka Panthera.onca_AllData_RUN2_RF Panthera.onca_AllData_RUN3_GBM Panthera.onca_AllData_RUN3_GLM Panthera.onca_AllData_RUN3_MAXENT.Tsuruoka Panthera.onca_AllData_RUN3_RF
*** here!!
> Projecting Panthera.onca_EMciSupByROC_mergedAlgo_mergedRun_mergedData ...
*** here!!
'data.frame': 63 obs. of 12 variables:
$ Panthera.onca_AllData_RUN1_GBM : num 859 938 949 904 905 920 959 865 938 937 ...
$ Panthera.onca_AllData_RUN1_GLM : num 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 ...
$ Panthera.onca_AllData_RUN1_RF : num 938 988 998 994 992 992 1000 924 996 968 ...
$ Panthera.onca_AllData_RUN2_GBM : num 911 907 932 937 936 943 962 905 950 941 ...
$ Panthera.onca_AllData_RUN2_MAXENT.Tsuruoka: num 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 ...
$ Panthera.onca_AllData_RUN2_RF : num 972 982 996 990 990 1000 1000 970 996 1000 ...
$ Panthera.onca_AllData_RUN3_GBM : num 875 939 949 936 937 942 958 917 947 944 ...
$ Panthera.onca_AllData_RUN3_GLM : num 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 ...
$ Panthera.onca_AllData_RUN3_MAXENT.Tsuruoka: num 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 ...
$ Panthera.onca_AllData_RUN3_RF : num 888 984 1000 994 990 998 1000 988 996 998 ...
$ Panthera.onca_AllData_RUN1_MAXENT.Tsuruoka: num 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 ...
$ Panthera.onca_AllData_RUN2_GLM : num 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 ...
NULL
***
Panthera.onca_AllData_RUN1_GBM Panthera.onca_AllData_RUN1_GLM Panthera.onca_AllData_RUN1_MAXENT.Tsuruoka Panthera.onca_AllData_RUN1_RF Panthera.onca_AllData_RUN2_GBM Panthera.onca_AllData_RUN2_GLM Panthera.onca_AllData_RUN2_MAXENT.Tsuruoka Panthera.onca_AllData_RUN2_RF Panthera.onca_AllData_RUN3_GBM Panthera.onca_AllData_RUN3_GLM Panthera.onca_AllData_RUN3_MAXENT.Tsuruoka Panthera.onca_AllData_RUN3_RF
*** here!!
> Projecting Panthera.onca_EMmedianByROC_mergedAlgo_mergedRun_mergedData ...
*** here!!
'data.frame': 63 obs. of 12 variables:
$ Panthera.onca_AllData_RUN1_GBM : num 859 938 949 904 905 920 959 865 938 937 ...
$ Panthera.onca_AllData_RUN1_GLM : num 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 ...
$ Panthera.onca_AllData_RUN1_RF : num 938 988 998 994 992 992 1000 924 996 968 ...
$ Panthera.onca_AllData_RUN2_GBM : num 911 907 932 937 936 943 962 905 950 941 ...
$ Panthera.onca_AllData_RUN2_MAXENT.Tsuruoka: num 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 ...
$ Panthera.onca_AllData_RUN2_RF : num 972 982 996 990 990 1000 1000 970 996 1000 ...
$ Panthera.onca_AllData_RUN3_GBM : num 875 939 949 936 937 942 958 917 947 944 ...
$ Panthera.onca_AllData_RUN3_GLM : num 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 ...
$ Panthera.onca_AllData_RUN3_MAXENT.Tsuruoka: num 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 ...
$ Panthera.onca_AllData_RUN3_RF : num 888 984 1000 994 990 998 1000 988 996 998 ...
$ Panthera.onca_AllData_RUN1_MAXENT.Tsuruoka: num 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 ...
$ Panthera.onca_AllData_RUN2_GLM : num 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 ...
NULL
***
Panthera.onca_AllData_RUN1_GBM Panthera.onca_AllData_RUN1_GLM Panthera.onca_AllData_RUN1_MAXENT.Tsuruoka Panthera.onca_AllData_RUN1_RF Panthera.onca_AllData_RUN2_GBM Panthera.onca_AllData_RUN2_GLM Panthera.onca_AllData_RUN2_MAXENT.Tsuruoka Panthera.onca_AllData_RUN2_RF Panthera.onca_AllData_RUN3_GBM Panthera.onca_AllData_RUN3_GLM Panthera.onca_AllData_RUN3_MAXENT.Tsuruoka Panthera.onca_AllData_RUN3_RF
*** here!!
> Projecting Panthera.onca_EMcaByROC_mergedAlgo_mergedRun_mergedData ...
*** here!!
'data.frame': 63 obs. of 12 variables:
$ Panthera.onca_AllData_RUN1_GBM : num 859 938 949 904 905 920 959 865 938 937 ...
$ Panthera.onca_AllData_RUN1_GLM : num 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 ...
$ Panthera.onca_AllData_RUN1_RF : num 938 988 998 994 992 992 1000 924 996 968 ...
$ Panthera.onca_AllData_RUN2_GBM : num 911 907 932 937 936 943 962 905 950 941 ...
$ Panthera.onca_AllData_RUN2_MAXENT.Tsuruoka: num 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 ...
$ Panthera.onca_AllData_RUN2_RF : num 972 982 996 990 990 1000 1000 970 996 1000 ...
$ Panthera.onca_AllData_RUN3_GBM : num 875 939 949 936 937 942 958 917 947 944 ...
$ Panthera.onca_AllData_RUN3_GLM : num 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 ...
$ Panthera.onca_AllData_RUN3_MAXENT.Tsuruoka: num 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 ...
$ Panthera.onca_AllData_RUN3_RF : num 888 984 1000 994 990 998 1000 988 996 998 ...
$ Panthera.onca_AllData_RUN1_MAXENT.Tsuruoka: num 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 ...
$ Panthera.onca_AllData_RUN2_GLM : num 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 ...
NULL
***
Panthera.onca_AllData_RUN1_GBM Panthera.onca_AllData_RUN1_GLM Panthera.onca_AllData_RUN1_MAXENT.Tsuruoka Panthera.onca_AllData_RUN1_RF Panthera.onca_AllData_RUN2_GBM Panthera.onca_AllData_RUN2_GLM Panthera.onca_AllData_RUN2_MAXENT.Tsuruoka Panthera.onca_AllData_RUN2_RF Panthera.onca_AllData_RUN3_GBM Panthera.onca_AllData_RUN3_GLM Panthera.onca_AllData_RUN3_MAXENT.Tsuruoka Panthera.onca_AllData_RUN3_RF
*** here!!
*** in setMethod('BinaryTransformation', signature(data='data.frame')
> Projecting Panthera.onca_EMwmeanByROC_mergedAlgo_mergedRun_mergedData ...
*** here!!
'data.frame': 63 obs. of 12 variables:
$ Panthera.onca_AllData_RUN1_GBM : num 859 938 949 904 905 920 959 865 938 937 ...
$ Panthera.onca_AllData_RUN1_GLM : num 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 ...
$ Panthera.onca_AllData_RUN1_RF : num 938 988 998 994 992 992 1000 924 996 968 ...
$ Panthera.onca_AllData_RUN2_GBM : num 911 907 932 937 936 943 962 905 950 941 ...
$ Panthera.onca_AllData_RUN2_MAXENT.Tsuruoka: num 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 ...
$ Panthera.onca_AllData_RUN2_RF : num 972 982 996 990 990 1000 1000 970 996 1000 ...
$ Panthera.onca_AllData_RUN3_GBM : num 875 939 949 936 937 942 958 917 947 944 ...
$ Panthera.onca_AllData_RUN3_GLM : num 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 ...
$ Panthera.onca_AllData_RUN3_MAXENT.Tsuruoka: num 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 ...
$ Panthera.onca_AllData_RUN3_RF : num 888 984 1000 994 990 998 1000 988 996 998 ...
$ Panthera.onca_AllData_RUN1_MAXENT.Tsuruoka: num 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 ...
$ Panthera.onca_AllData_RUN2_GLM : num 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 ...
NULL
***
Panthera.onca_AllData_RUN1_GBM Panthera.onca_AllData_RUN1_GLM Panthera.onca_AllData_RUN1_MAXENT.Tsuruoka Panthera.onca_AllData_RUN1_RF Panthera.onca_AllData_RUN2_GBM Panthera.onca_AllData_RUN2_GLM Panthera.onca_AllData_RUN2_MAXENT.Tsuruoka Panthera.onca_AllData_RUN2_RF Panthera.onca_AllData_RUN3_GBM Panthera.onca_AllData_RUN3_GLM Panthera.onca_AllData_RUN3_MAXENT.Tsuruoka Panthera.onca_AllData_RUN3_RF
*** here!!
> Building TSS binaries
*** in setMethod('BinaryTransformation', signature(data='matrix')
*** in setMethod('BinaryTransformation', signature(data='data.frame')
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= Done -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
# Plot your results
plot(myBiomodEF)Multi-species modelling
Now, that we know how to do single species modelling, what changes if we want to model the distribution of n species?
biomod2 has no function that allows multi-species to be modelled. But, we can use the help of other R functions to deal with this! For example, we can create a for loop that will execute the modelling parts for each species within our species data frame. See below, (but do not execute it yet!)
# define the species of interest
sp.names <- names(DataSpecies)[1:5]
for (sp.n in sp.names) {
myRespName = sp.n
cat("\n", myRespName, "modelling...")
### definition of data i.e keep only the column of our species
myResp <- as.numeric(DataSpecies[, myRespName])
myRespCoord = myRespCoord
### Initialisation
myBiomodData <- BIOMOD_FormatingData(resp.var = myResp, expl.var = myExpl,
resp.xy = myRespCoord, resp.name = myRespName)
### Options definition
myBiomodOption <- BIOMOD_ModelingOptions()
### Modelling
myBiomodModelOut <- BIOMOD_Modeling(myBiomodData, models = c("GBM", "GLM",
"MAXENT.Tsuruoka", "RF"), models.options = myBiomodOption, NbRunEval = 3,
DataSplit = 80, Prevalence = 0.5, VarImport = 3, models.eval.meth = c("TSS",
"ROC"), SaveObj = TRUE, rescal.all.models = FALSE, do.full.models = FALSE,
modeling.id = paste(myRespName, "CurrentClim", sep = ""))
### save models evaluation scores and variables importance on hard drive
capture.output(get_evaluations(myBiomodModelOut), file = file.path(myRespName,
paste(myRespName, "_formal_models_eval.txt", sep = "")))
capture.output(get_variables_importance(myBiomodModelOut), file = file.path(myRespName,
paste(myRespName, "_formal_models_var_imp.txt", sep = "")))
### Building ensemble-models
myBiomodEM <- BIOMOD_EnsembleModeling(modeling.output = myBiomodModelOut,
chosen.models = "all", em.by = "all", eval.metric = "all", eval.metric.quality.threshold = c(0.7),
prob.mean = T, prob.cv = T, prob.ci = T, prob.ci.alpha = 0.05, prob.median = T,
committee.averaging = T, prob.mean.weight = T, prob.mean.weight.decay = "proportional")
### Make projections on current variable
myBiomodProj <- BIOMOD_Projection(modeling.output = myBiomodModelOut, new.env = myExpl,
xy.new.env = myRespCoord, proj.name = "current", selected.models = "all",
binary.meth = "TSS", compress = TRUE, clamping.mask = F, output.format = ".RData")
### Make ensemble-models projections on current variable
myBiomodEF <- BIOMOD_EnsembleForecasting(projection.output = myBiomodProj,
binary.meth = "TSS", total.consensus = TRUE, EM.output = myBiomodEM)
}Another (faster, and perhaps, more elegant) way for doing multi-species SDM using biomod2, is to use a apply family function. Like this one:
Panthera.onca modelling...
-=-=-=-=-=-=-=-=-=-=-= Panthera.onca Data Formating -=-=-=-=-=-=-=-=-=-=-=
> No pseudo absences selection !
! No data has been set aside for modeling evaluation
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= Done -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Loading required library...
Checking Models arguments...
Creating suitable Workdir...
> Automatic weights creation to rise a 0.5 prevalence
-=-=-=-=-=-=-=-=-=-=-= Panthera.onca Modeling Summary -=-=-=-=-=-=-=-=-=-=-=
6 environmental variables ( bio_1 bio_4 bio_8 bio_12 bio_15 bio_18 )
Number of evaluation repetitions : 3
Models selected : RF MAXENT.Tsuruoka
Total number of model runs : 6
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
-=-=-=- Run : Panthera.onca_AllData
-=-=-=--=-=-=- Panthera.onca_AllData_RUN1
Model=Breiman and Cutler's random forests for classification and regression
Evaluating Model stuff...
Evaluating Predictor Contributions...
Produce a richness map
### Make ensemble-models projections on current variable load the first speces
### binary maps which will define the mask
alphaMap <- reclassify(subset(myExpl, 1), c(-Inf, Inf, 0))
alphaMap <- get(load(paste(getwd(), "/", myRespName[1], "/proj_current/", "proj_current_",
myRespName[1], "_ensemble_TSSbin.RData", sep = "")))[, 1]